This report performs SVD-based alignment analysis between router vectors and expert weight matrices.
Plot Explanations
The following section explains what each plot type shows and how it is computed. All plots use layer numbers (e.g., L5, L10) in their legends, without timestamps.
Comparison Plots
Comparison Plots: This figure contains four subplots comparing multiple analysis runs:
- Alignment vs k: Shows the mean alignment score (projection energy) as a function of k (number of top singular vectors used).
- Formula: align(k) = Σᵢ₌₁ᵏ (vᵢᵀ · r)², where vᵢ are the top-k right singular vectors from SVD of expert weight matrix, and r is the normalized router vector.
- Interpretation: The alignment score is the sum of squared projections of the router vector onto the top-k right singular vectors of the expert weight matrix. For k=1, this equals cos²(θ) between the router vector and the top singular vector. For k>1, it sums the squared projections across multiple singular vectors, measuring how much of the router vector's energy lies in the top-k dimensional subspace of the expert.
- Computation: (1) Perform SVD on each expert weight matrix W to get right singular vectors V (columns are singular vectors), (2) Normalize the router vector to unit length, (3) Project the normalized router vector onto the top-k columns of V: proj = V[:, :k]ᵀ @ router_vec, (4) Sum the squared projections: align = Σ(proj²).
- Range: [0, 1]. Value of 1 means the router vector lies entirely in the top-k subspace. Value of 0 means it's orthogonal to that subspace.
- Higher values indicate: Stronger alignment - the router vector is well-aligned with the principal directions of the expert weight matrix.
- Z-score vs k: Shows the z-score of alignment compared to shuffled baselines.
- Formula: z(k) = (align(k) - shuffle_mean(k)) / shuffle_std(k)
- Interpretation: Measures how many standard deviations the actual alignment is above the shuffle baseline. This is a normalized measure of statistical significance.
- Computation: (1) Compute shuffle_mean and shuffle_std by shuffling router-expert assignments many times (typically 200) and computing alignment for each shuffle, (2) Calculate z-score = (actual_alignment - shuffle_mean) / shuffle_std.
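- Interpretation thresholds: If the shuffled alignments are approximately normal, z > 2 corresponds to a one-sided p-value of about 0.023 and z > 3 to about 0.0013 that alignment exceeds chance (the familiar ~95% and ~99.7% figures are the two-sided equivalents). Values near 0 indicate alignment is consistent with random assignments.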
- Higher values indicate: More statistically significant alignment above the shuffle baseline.
- Effect over Random vs k: Shows the effect size over theoretical random baseline.
- Formula: effect_over_random(k) = align(k) - (k / d_model)
- Interpretation: Measures how much the actual alignment exceeds the theoretical expectation if the router vector were randomly oriented in d_model-dimensional space. The baseline k/d_model is the expected projection energy onto a random k-dimensional subspace.
- Computation: (1) Calculate theoretical baseline: random_expect = k / d_model (where d_model is the model dimension, typically 4096), (2) Subtract from actual alignment: effect = align - random_expect.
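- Baseline explanation: If a unit vector is uniformly randomly oriented in d_model dimensions, the expected projection energy onto any fixed k-dimensional subspace is k/d_model. This follows from symmetry: the squared coordinates in any orthonormal basis sum to 1 and each has the same expectation, 1/d_model, so k of them contribute k/d_model.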
- Positive values indicate: Alignment exceeds theoretical random expectation. Negative values indicate alignment is below even random expectation (rare but possible).
- Limitation: This baseline assumes completely random orientation and doesn't account for the actual structure of router and expert vectors.
- Delta vs Shuffle vs k: Shows the difference between actual alignment and empirical shuffle baseline.
- Formula: delta(k) = align(k) - shuffle_mean(k)
- Interpretation: Measures the raw difference between actual alignment and the empirical mean from shuffled assignments. This preserves the actual structure of router and expert vectors but randomizes which router is assigned to which expert.
- Computation: (1) Perform many shuffles (typically 200): randomly permute which router vector is assigned to which expert, (2) For each shuffle, compute alignment using the shuffled assignments, (3) Calculate shuffle_mean = mean of all shuffle alignments, (4) Calculate delta = actual_alignment - shuffle_mean.
- Why it's more realistic: Unlike the theoretical baseline, this preserves the actual structure and magnitude of router and expert vectors. It only randomizes the assignment, making it a more appropriate null hypothesis for testing whether specific router-expert pairs are aligned.
- Positive values indicate: Actual alignment exceeds the empirical shuffle baseline, suggesting meaningful router-expert alignment beyond random assignment.
- Comparison to Effect over Random: Delta vs Shuffle is typically more conservative (smaller values) because shuffle_mean accounts for the actual vector structures, whereas k/d_model assumes completely random vectors.
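The alignment and effect-over-random computations described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the report's actual code: the function names `alignment` and `effect_over_random`, the toy dimensions, and the assumption that W has shape (d_ff, d_model), so that its right singular vectors live in the router's space, are all introduced here for the example.

```python
import numpy as np

def alignment(W, router_vec, ks):
    """Projection energy of the unit router vector onto the top-k right singular vectors of W."""
    # full_matrices=True returns a complete orthonormal basis of the d_model space
    _, _, Vh = np.linalg.svd(W, full_matrices=True)
    r = router_vec / np.linalg.norm(router_vec)   # normalize to unit length
    energy = (Vh @ r) ** 2                        # (v_i^T r)^2 for every singular vector
    cumulative = np.cumsum(energy)                # align(k) for k = 1..d_model
    return {k: float(cumulative[k - 1]) for k in ks}

def effect_over_random(align, k, d_model):
    """Alignment minus the k/d_model expectation for a uniformly random unit vector."""
    return align - k / d_model

# Toy data (hypothetical sizes; the report mentions d_model = 4096 as typical)
rng = np.random.default_rng(0)
d_ff, d_model = 32, 16
W = rng.standard_normal((d_ff, d_model))
router = rng.standard_normal(d_model)

scores = alignment(W, router, ks=[1, 4, 16])
effects = {k: effect_over_random(a, k, d_model) for k, a in scores.items()}
```

Because the right singular vectors form an orthonormal basis, the cumulative energy reaches 1 at k = d_model, so the effect over random returns to 0 there.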
Cos²(θ) Expert Comparison
Cos²(θ) Expert Comparison: This figure compares cos²(θ) values across experts and layers (for k=1 only):
- Left plot: Shows cos²(θ) per expert for k=1 (using only the top singular vector). Each line represents a different layer.
- Formula: cos²(θ) = (rᵀ · v₁)², where r is the normalized router vector and v₁ is the top right singular vector (first column of V from SVD).
- Interpretation: Measures the squared cosine of the angle between the router vector and the principal direction (top singular vector) of the expert weight matrix. This is a direct correlation measure indicating how well-aligned the router is with the expert's primary direction.
- Computation: (1) Perform SVD on expert weight matrix to get V (right singular vectors), (2) Extract v₁ = V[:, 0] (top singular vector), (3) Normalize router vector to unit length, (4) Compute cos²(θ) = (router_vecᵀ · v₁)².
- Relationship to alignment: For k=1, cos²(θ) = align(k=1). For k>1, align(k) = Σᵢ₌₁ᵏ cos²(θᵢ) where θᵢ is the angle with the i-th singular vector.
- Range: [0, 1]. Value of 1 means router is perfectly aligned with top singular vector. Value of 0 means router is orthogonal to it.
- Higher values indicate: Stronger alignment between router and expert's principal direction.
- Right plot: Shows mean cos²(θ) across all experts for each layer at k=1. Bar heights represent the average alignment strength per layer.
- Formula: mean_cos²(θ) = (1/n_experts) · Σᵢ cos²(θᵢ), where the sum is over all experts in the layer.
- Interpretation: Average alignment strength across all experts in a layer. Provides a layer-level summary of router-expert alignment.
- Computation: (1) Compute cos²(θ) for each expert at k=1, (2) Average across all experts in the layer.
- Use case: Compare alignment strength across different layers. Higher values indicate stronger overall alignment in that layer.
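As a rough sketch of the per-expert and per-layer cos²(θ) values described above, the SVD can be batched over all experts in a layer. The array layout (experts stacked along the first axis, weights of shape (d_ff, d_model)) and the name `cos2_per_expert` are assumptions made for this example only.

```python
import numpy as np

def cos2_per_expert(W_all, R_all):
    """cos^2 between each expert's top right singular vector and its router vector.

    W_all: (n_experts, d_ff, d_model) stacked expert weights (assumed layout)
    R_all: (n_experts, d_model) router vectors, one row per expert
    """
    # np.linalg.svd broadcasts over the leading axis; rows of Vh are right singular vectors
    _, _, Vh = np.linalg.svd(W_all, full_matrices=False)
    v1 = Vh[:, 0, :]                                           # top singular vector per expert
    R_unit = R_all / np.linalg.norm(R_all, axis=1, keepdims=True)
    return np.einsum("ed,ed->e", v1, R_unit) ** 2              # squared dot product per expert

# Toy layer with 8 experts (hypothetical sizes)
rng = np.random.default_rng(0)
W_all = rng.standard_normal((8, 32, 16))
R_all = rng.standard_normal((8, 16))

cos2 = cos2_per_expert(W_all, R_all)   # left plot: one value per expert
layer_mean = float(cos2.mean())        # right plot: one bar per layer
```

The sign of a singular vector is arbitrary, but squaring the dot product removes that ambiguity, so the result does not depend on which sign the SVD routine returns.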
Shuffle Statistics
Shuffle Statistics: This figure shows statistics from shuffled baseline comparisons:
- Shuffle Mean vs k: Mean alignment value from shuffled router-expert assignments as a function of k.
- Formula: shuffle_mean(k) = (1/n_shuffles) · Σᵢ align_shuffled_i(k), where align_shuffled_i is the alignment computed with the i-th shuffled assignment.
- Interpretation: The expected alignment under the null hypothesis that router-expert assignments are random. This is the empirical baseline used for statistical comparison.
- Computation: (1) For each shuffle iteration (typically 200): randomly permute which router vector is assigned to which expert, (2) Compute alignment for each shuffled assignment using the same projection energy formula, (3) Average all shuffle alignments: shuffle_mean = mean(align_shuffled).
- Why it matters: This provides the null distribution mean. Actual alignment significantly above this suggests meaningful structure beyond random assignment.
- Typical behavior: Usually increases with k (more dimensions = higher projection energy), but typically lower than actual alignment when there's real structure.
- Shuffle Std vs k: Standard deviation of alignment values from shuffled assignments.
- Formula: shuffle_std(k) = std(align_shuffled(k)) = √[(1/(n-1)) · Σᵢ (align_shuffled_i - shuffle_mean)²]
- Interpretation: Measures the variability in alignment when router-expert assignments are randomized. Larger values indicate more uncertainty in the null distribution.
- Computation: (1) Compute alignments for all shuffle iterations, (2) Calculate standard deviation across all shuffle alignments.
- Use in z-score: Used as the denominator in z-score calculation: z = (align - shuffle_mean) / shuffle_std. Larger std means smaller z-scores for the same delta, making significance harder to achieve.
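- Typical behavior: Usually increases with k over most of the range, as higher-dimensional projections have more variability. It must fall back to zero at k = d_model, where every assignment gives an alignment of exactly 1.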
- Log Shuffle Std vs k: Logarithm of shuffle standard deviation.
- Formula: log_shuffle_std(k) = log(shuffle_std(k) + ε), where ε is a small constant (typically 1e-10) to avoid log(0).
- Interpretation: Logarithmic scale makes it easier to visualize exponential or power-law relationships in the standard deviation.
- Computation: (1) Compute shuffle_std as above, (2) Apply natural logarithm: log(std + ε).
- Why use log scale: If std grows exponentially with k, the log plot will show a linear relationship, making patterns easier to identify.
- Use case: Helps identify whether variability grows exponentially, linearly, or sub-linearly with k.
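The shuffle statistics above could be computed along the following lines. This is a sketch under stated assumptions: each shuffled alignment is taken to be the mean over experts for one random permutation of router-to-expert assignments, and the names `mean_alignment` and `shuffle_stats`, the array layout, and the seed are hypothetical.

```python
import numpy as np

def mean_alignment(Vh_all, R_unit, k, assignment):
    """Mean top-k projection energy when expert e is paired with router assignment[e]."""
    # Vh_all: (n_experts, d_model, d_model), rows are right singular vectors
    proj = np.einsum("ekd,ed->ek", Vh_all[:, :k, :], R_unit[assignment])
    return float((proj ** 2).sum(axis=1).mean())

def shuffle_stats(Vh_all, R_unit, k, n_shuffles=200, seed=0, eps=1e-10):
    """Return (shuffle_mean, shuffle_std, log_shuffle_std) for one value of k."""
    rng = np.random.default_rng(seed)
    n_experts = R_unit.shape[0]
    vals = np.array([
        mean_alignment(Vh_all, R_unit, k, rng.permutation(n_experts))
        for _ in range(n_shuffles)
    ])
    std = float(vals.std(ddof=1))          # sample standard deviation, 1/(n-1)
    return float(vals.mean()), std, float(np.log(std + eps))

# Toy layer (hypothetical sizes)
rng = np.random.default_rng(1)
W_all = rng.standard_normal((8, 32, 16))
R_all = rng.standard_normal((8, 16))
_, _, Vh_all = np.linalg.svd(W_all, full_matrices=True)
R_unit = R_all / np.linalg.norm(R_all, axis=1, keepdims=True)

actual = mean_alignment(Vh_all, R_unit, 4, np.arange(8))   # unshuffled pairing
s_mean, s_std, s_log_std = shuffle_stats(Vh_all, R_unit, k=4)
delta = actual - s_mean
```

A uniformly random permutation can leave some routers paired with their own expert; whether the original analysis excludes such cases is not stated, so this sketch does not.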
Z-score Decomposition
Z-score Decomposition: This figure breaks down the z-score calculation into its components:
- Delta vs k: The numerator of z-score.
- Formula: Δ(k) = align(k) - shuffle_mean(k)
- Interpretation: The raw difference between actual alignment and the shuffle baseline. This is the "effect size" before normalization.
- Computation: (1) Compute actual alignment for each k, (2) Compute shuffle_mean for each k (from shuffle statistics), (3) Calculate delta = align - shuffle_mean for each k.
- Units: Same as alignment (dimensionless). Because alignment and shuffle_mean both lie in [0, 1], delta lies in [-1, 1] and can be negative or positive.
- Positive values indicate: Actual alignment exceeds shuffle baseline. Negative values indicate actual alignment is below shuffle baseline (rare but possible).
- Relationship to z-score: Delta is the numerator. Larger delta (with same std) leads to larger z-score.
- Shuffle Std vs k: The denominator of z-score.
- Formula: σ_shuffle(k) = std(align_shuffled(k))
- Interpretation: The variability in shuffled alignments. This is the same metric shown in Shuffle Statistics plot, but displayed here to show its role in z-score normalization.
- Computation: Standard deviation across all shuffle iterations for each k value. Same as described in Shuffle Statistics.
- Role in z-score: Acts as the normalization factor. Larger std means the same delta produces a smaller z-score, making it harder to achieve statistical significance.
- Why it matters: Understanding std helps interpret z-scores. A large delta with large std might have a moderate z-score, while a smaller delta with small std might have a large z-score.
- Z-score vs k: The final z-score.
- Formula: z(k) = Δ(k) / σ_shuffle(k) = (align(k) - shuffle_mean(k)) / shuffle_std(k)
- Interpretation: The number of standard deviations the actual alignment is above (or below) the shuffle baseline. This is a normalized measure of statistical significance.
- Computation: (1) Compute delta for each k, (2) Compute shuffle_std for each k, (3) Calculate z = delta / shuffle_std for each k.
- Statistical interpretation: Under the null hypothesis (random assignments), z follows approximately a standard normal distribution. For the one-sided question of whether alignment exceeds chance, z > 2 corresponds to p ≈ 0.023 and z > 3 to p ≈ 0.0013; the commonly quoted ~95% (p < 0.05) and ~99.7% (p < 0.003) levels are the two-sided equivalents.
- Advantages over delta: Normalized measure that accounts for variability. A delta of 0.1 might be significant if std=0.02 (z=5) but not if std=0.1 (z=1).
- Higher values indicate: More statistically significant alignment above the shuffle baseline.
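The decomposition above can be illustrated with the numbers from the text (a delta of 0.1 with std 0.02 versus std 0.1). The sketch below uses only the Python standard library. The function names are hypothetical, the align and shuffle_mean inputs are made-up values chosen to give a delta of 0.1, and returning z = 0 when shuffle_std is 0 is an assumed convention suggested by the z = 0 rows at k = 4096 in the result tables.

```python
from statistics import NormalDist

def z_score(align, shuffle_mean, shuffle_std):
    """Return (delta, z); z is set to 0.0 when shuffle_std is 0 (assumed convention)."""
    delta = align - shuffle_mean
    z = delta / shuffle_std if shuffle_std > 0 else 0.0
    return delta, z

def one_sided_p(z):
    """P(Z >= z) for a standard normal Z."""
    return 1.0 - NormalDist().cdf(z)

# Same delta of 0.1 under two different null spreads
_, z_tight = z_score(0.15, 0.05, 0.02)   # z is about 5
_, z_wide = z_score(0.15, 0.05, 0.10)    # z is about 1

p_at_2 = one_sided_p(2.0)   # about 0.023
p_at_3 = one_sided_p(3.0)   # about 0.0013
```

The p-values are only as trustworthy as the normal approximation to the shuffle distribution.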
Distribution Comparison
Distribution Comparison: This plot shows the probability distribution of shuffled alignments (projection energies) compared to the true alignment value:
- Solid curves: Approximate normal distribution of alignment values (projection energies) from shuffled router-expert assignments.
- Formula: P(align) ≈ N(μ_shuffle, σ²_shuffle), where μ_shuffle = shuffle_mean and σ_shuffle = shuffle_std.
- Interpretation: The probability distribution of what alignment values we would expect by chance when router vectors are randomly assigned to experts. This is the null distribution for statistical testing.
- Computation: (1) Compute shuffle_mean and shuffle_std from shuffle experiments, (2) Approximate the distribution as a normal distribution: N(shuffle_mean, shuffle_std²), (3) Plot the probability density function using scipy.stats.norm.pdf(x, shuffle_mean, shuffle_std).
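- Why normal distribution: The normal curve is an approximation. If each shuffled alignment is a mean over experts (as the true alignment is), the Central Limit Theorem gives some justification, although with few experts per layer the fit may be rough. Using 200 shuffles improves the estimates of shuffle_mean and shuffle_std; it does not by itself make the null distribution normal.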
- What it shows: The range and likelihood of alignment values under the null hypothesis. The peak is at shuffle_mean, and the width is determined by shuffle_std.
- Dashed vertical lines: The actual alignment value (projection energy) for each run.
- Formula: true_align(k) = (1/n_experts) · Σᵢ align_i(k), where align_i is the alignment for expert i at k.
- Interpretation: The observed mean alignment across all experts for the given k value. This is what we're testing against the null distribution.
- Computation: (1) Compute alignment for each expert at the given k value, (2) Average across all experts: true_align = mean(align_expert).
- Position relative to distribution: If the line is far to the right of the distribution peak (shuffle_mean), it indicates strong alignment above chance. The distance from the peak, measured in standard deviations, corresponds to the z-score.
- Statistical interpretation: If the line falls in the right tail of the distribution (beyond ~2σ), it suggests the alignment is statistically significant (p < 0.05).
- Multiple runs: Each run gets its own dashed line, allowing comparison of alignment strength across different layers or configurations.
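A minimal sketch of how the solid curve and the position of the dashed line relate is shown below. It uses `statistics.NormalDist` from the standard library as a stand-in for the `scipy.stats.norm.pdf` call mentioned above; the helper name `null_curve` and all numeric values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def null_curve(shuffle_mean, shuffle_std, n_points=201, width=4.0):
    """Sample N(shuffle_mean, shuffle_std^2) on a grid spanning +/- width standard deviations."""
    dist = NormalDist(shuffle_mean, shuffle_std)
    lo = shuffle_mean - width * shuffle_std
    step = 2.0 * width * shuffle_std / (n_points - 1)
    xs = [lo + i * step for i in range(n_points)]
    return xs, [dist.pdf(x) for x in xs]

# Hypothetical values, not taken from any run in this report
shuffle_mean, shuffle_std, true_align = 0.05, 0.02, 0.11

xs, pdf = null_curve(shuffle_mean, shuffle_std)          # solid curve
z = (true_align - shuffle_mean) / shuffle_std            # distance of dashed line from the peak, in sigmas
tail = 1.0 - NormalDist(shuffle_mean, shuffle_std).cdf(true_align)   # area to the right of the dashed line
```

The tail area is the one-sided p-value implied by the normal approximation, which is what the position of the dashed line conveys visually.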
Per-Expert Breakdown
Per-Expert Breakdown: This figure provides detailed expert-level analysis:
- Alignment Heatmap (top left): Shows alignment values (projection energies) for each expert (rows) across different k values (columns).
- Formula: heatmap[expert, k] = mean(align_expert,k), averaged across runs when more than one is present.
- Interpretation: Visual representation of how alignment strength varies across experts and k values. Each cell shows the mean alignment for a specific expert-k combination.
- Computation: (1) Group data by expert and k, (2) Average alignment values within each group, (3) Create pivot table: pivot = df.pivot_table(values='align', index='expert', columns='k', aggfunc='mean'), (4) Display as heatmap with color intensity proportional to alignment value.
- Color scheme: Uses the 'viridis' colormap, in which brighter colors (yellow/green) indicate stronger alignment and darker colors (blue/purple) indicate weaker alignment.
- What to look for: Patterns across experts (rows) show which experts have consistently high/low alignment. Patterns across k (columns) show how alignment changes with dimensionality.
- Use case: Identify experts with particularly strong or weak alignment, and see how alignment scales with k for each expert.
- Delta Heatmap (top right): Shows delta (alignment - shuffle_mean) for each expert across k values.
- Formula: heatmap[expert, k] = mean(align_expert,k - shuffle_mean_k), averaged across runs if present.
- Interpretation: Shows how much each expert's alignment exceeds (or falls below) the shuffle baseline at each k value.
- Computation: (1) Compute delta for each expert-k combination: delta = align - shuffle_mean, (2) Create pivot table: pivot = df.pivot_table(values='delta_vs_shuffle', index='expert', columns='k', aggfunc='mean'), (3) Display as heatmap with colormap centered at zero.
- Color scheme: Red indicates positive delta (above shuffle baseline), blue indicates negative delta (below shuffle baseline). Uses 'RdBu_r' (Red-Blue reversed) colormap, centered at zero using vmin=-max_abs, vmax=max_abs.
- What to look for: Experts with consistently red cells have strong alignment above baseline. Experts with blue cells have alignment below baseline (rare but possible).
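- Advantage over alignment heatmap: Values are measured relative to the shuffle baseline, making it easier to see which experts exceed chance expectations. Note that delta is baseline-subtracted, not divided by shuffle_std; the z-score provides that normalization.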
- Alignment vs Expert (bottom left): Scatter plot showing alignment values (projection energies) for each expert at a fixed k (typically k=128).
- Formula: For each expert i: align_i(k_fixed), where k_fixed is typically 128 or the median k value if 128 is not available.
- Interpretation: Shows the distribution of alignment strengths across experts at a representative k value. Each point represents one expert's alignment.
- Computation: (1) Select a representative k value (prefer k=128, fallback to median k), (2) Filter data: k_data = df[df['k'] == k_fixed], (3) Extract alignment values: align_vals = k_data['align'].values, expert_vals = k_data['expert'].values, (4) Plot as scatter: scatter(expert_vals, align_vals).
- Why scatter plot: Shows individual expert values rather than averages, revealing variability and outliers.
- What to look for: Experts with particularly high or low alignment values. Clustering of points suggests similar alignment strengths across experts.
- Multiple runs: If comparing multiple runs, each run gets a different color/marker, allowing comparison of alignment patterns across layers or configurations.
- Delta vs Expert (bottom right): Scatter plot showing delta values for each expert at a fixed k.
- Formula: For each expert i: delta_i(k_fixed) = align_i(k_fixed) - shuffle_mean(k_fixed).
- Interpretation: Shows how much each expert's alignment exceeds (or falls below) the shuffle baseline at a representative k value.
- Computation: (1) Use the same k_fixed as in Alignment vs Expert plot, (2) Filter data: k_data = df[df['k'] == k_fixed], (3) Extract delta values: delta_vals = k_data['delta_vs_shuffle'].values, expert_vals = k_data['expert'].values, (4) Plot as scatter: scatter(expert_vals, delta_vals), (5) Add horizontal line at y=0 for reference.
- Reference line: The horizontal dashed line at y=0 separates experts above baseline (positive delta) from those below baseline (negative delta).
- What to look for: Experts with delta significantly above zero have strong alignment. Most experts should have positive delta if there's meaningful structure. Negative delta is rare but indicates alignment below even random assignment.
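- Advantage over Alignment vs Expert: Values are measured relative to the shuffle baseline, making it easier to identify experts whose alignment exceeds chance. Judging statistical significance additionally requires the z-score, which divides delta by shuffle_std.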
- Multiple runs: If comparing multiple runs, each run gets different markers, showing how delta patterns vary across layers or configurations.
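The data preparation behind these four panels might look roughly like the following pandas sketch. The column names `expert`, `k`, `align` and `delta_vs_shuffle` come from the descriptions above; the toy values, the helper `pick_k`, and its choice of the middle element as the "median k" fallback are assumptions for illustration.

```python
import pandas as pd

# Hypothetical long-format results: one row per (expert, k)
df = pd.DataFrame({
    "expert": [0, 0, 1, 1, 2, 2],
    "k": [64, 256, 64, 256, 64, 256],
    "align": [0.17, 0.31, 0.18, 0.30, 0.16, 0.32],
    "delta_vs_shuffle": [0.08, 0.09, -0.01, 0.10, 0.07, 0.11],
})

# Heatmap inputs: experts as rows, k values as columns
align_pivot = df.pivot_table(values="align", index="expert", columns="k", aggfunc="mean")
delta_pivot = df.pivot_table(values="delta_vs_shuffle", index="expert", columns="k", aggfunc="mean")

# Symmetric color limits so that zero sits at the center of the diverging colormap
max_abs = float(delta_pivot.abs().to_numpy().max())
vmin, vmax = -max_abs, max_abs

def pick_k(ks, preferred=128):
    """Prefer k=128; otherwise fall back to the middle of the sorted unique k values."""
    unique = sorted({int(k) for k in ks})
    return preferred if preferred in unique else unique[len(unique) // 2]

# Scatter inputs at the representative k
k_fixed = pick_k(df["k"])
k_data = df[df["k"] == k_fixed]
expert_vals = k_data["expert"].values
delta_vals = k_data["delta_vs_shuffle"].values
```

Taking the middle element rather than a numeric median guarantees that k_fixed is a value actually present in the data, which matters because the filter uses exact equality.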
Complete Analysis Visualization
Complete Analysis Visualization: This comprehensive figure contains 12 subplots showing all key metrics for a single analysis run:
- Row 1: Alignment, Z-score, and Effect over Random vs k (log scale)
- Row 2: Delta vs Shuffle, Shuffle Mean, and Shuffle Std vs k
- Row 3: Heatmaps showing Alignment, Delta, and Z-score across experts (rows) and k values (columns)
- Row 4: Cos²(θ) per expert (k=1), Alignment per expert, and Z-score per expert at a representative k value
All metrics are computed as described in the individual plot explanations above.
1. Comparison Across All Runs
This section compares all 6 result files side by side.
Comparison Plots
See "Plot Explanations" section at the top of this report for detailed information about this plot.
Cos²(θ) Expert Comparison
See "Plot Explanations" section at the top of this report for detailed information about this plot.
Diagnostic Plots (Comparison)
Diagnostic plots comparing all runs to understand differences in alignment, z-score, and delta.
Shuffle Statistics
See "Plot Explanations" section at the top of this report for detailed information about this plot.
Z-score Decomposition
See "Plot Explanations" section at the top of this report for detailed information about this plot.
Distribution Comparison (k=32)
See "Plot Explanations" section at the top of this report for detailed information about this plot.
Distribution Comparison (k=128)
See "Plot Explanations" section at the top of this report for detailed information about this plot.
Distribution Comparison (k=512)
See "Plot Explanations" section at the top of this report for detailed information about this plot.
Distribution Comparison (k=2048)
See "Plot Explanations" section at the top of this report for detailed information about this plot.
Per-Expert Breakdown
See "Plot Explanations" section at the top of this report for detailed information about this plot.
2. Individual Analysis - Run 1
Setup and Configuration
Summary Statistics (averaged across experts)
| k | align | delta_vs_shuffle | z_vs_shuffle | effect_over_random | cos_squared |
| --- | --- | --- | --- | --- | --- |
| 1 | 0.095886 | 0.083623 | 2.767175 | 0.095642 | 0.095886 |
| 2 | 0.096110 | 0.081791 | 2.544904 | 0.095622 | 0.000000 |
| 4 | 0.096928 | 0.083131 | 2.722896 | 0.095951 | 0.000000 |
| 8 | 0.105326 | 0.085136 | 2.479962 | 0.103373 | 0.000000 |
| 16 | 0.122886 | 0.091767 | 2.635744 | 0.118980 | 0.000000 |
| 32 | 0.145117 | 0.090595 | 2.689427 | 0.137305 | 0.000000 |
| 64 | 0.173137 | 0.080351 | 2.325970 | 0.157512 | 0.000000 |
| 128 | 0.228232 | 0.083211 | 2.313799 | 0.196982 | 0.000000 |
| 256 | 0.308274 | 0.092401 | 2.296443 | 0.245774 | 0.000000 |
| 512 | 0.429478 | 0.095819 | 1.903831 | 0.304478 | 0.000000 |
| 1024 | 0.594914 | 0.085419 | 1.779300 | 0.344914 | 0.000000 |
| 2048 | 0.782698 | 0.052253 | 1.832639 | 0.282698 | 0.000000 |
| 4096 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
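As a quick consistency check, the effect_over_random column for Run 1 can be reproduced from the align column using the formula effect_over_random(k) = align(k) - k/d_model with d_model = 4096. The snippet below copies a few rows from the table; the variable names are illustrative, and the implied shuffle mean is simply derived from delta(k) = align(k) - shuffle_mean(k), not reported directly in the table.

```python
D_MODEL = 4096

# k: (align, delta_vs_shuffle, effect_over_random), copied from the Run 1 summary table
rows = {
    1: (0.095886, 0.083623, 0.095642),
    32: (0.145117, 0.090595, 0.137305),
    512: (0.429478, 0.095819, 0.304478),
    2048: (0.782698, 0.052253, 0.282698),
    4096: (1.000000, 0.000000, 0.000000),
}

# Recompute the effect over the theoretical random baseline
recomputed_effect = {k: align - k / D_MODEL for k, (align, _, _) in rows.items()}

# Shuffle mean implied by delta = align - shuffle_mean
implied_shuffle_mean = {k: align - delta for k, (align, delta, _) in rows.items()}

# Largest disagreement with the reported column (expected to be at rounding level)
max_diff = max(abs(recomputed_effect[k] - rows[k][2]) for k in rows)
```

The recomputed values agree with the reported column to within the six-decimal rounding of the table.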
Cos²(θ) Alignment (k=1)
Mean cos²(θ): 0.095886
Max cos²(θ): 0.113289
Min cos²(θ): 0.081037
Std cos²(θ): 0.011979
Per-expert cos²(θ) values:
| Expert | cos²(θ) | align |
| --- | --- | --- |
| 0 | 0.087693 | 0.087692 |
| 1 | 0.088284 | 0.088284 |
| 2 | 0.101300 | 0.101300 |
| 3 | 0.111540 | 0.111540 |
| 4 | 0.097033 | 0.097033 |
| 5 | 0.113289 | 0.113289 |
| 6 | 0.081037 | 0.081037 |
| 7 | 0.086911 | 0.086911 |
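The mean, max, min and std reported for Run 1 can be recomputed from the eight per-expert cos²(θ) values using only the standard library. The values are copied from the table; the variable names are illustrative.

```python
from statistics import mean, pstdev, stdev

# Per-expert cos^2(theta) for Run 1 (experts 0..7), copied from the table
cos2 = [0.087693, 0.088284, 0.101300, 0.111540,
        0.097033, 0.113289, 0.081037, 0.086911]

summary = {
    "mean": mean(cos2),
    "max": max(cos2),
    "min": min(cos2),
    "std_sample": stdev(cos2),        # divides by n - 1
    "std_population": pstdev(cos2),   # divides by n
}
```

The sample form (n - 1 in the denominator) reproduces the reported Std of 0.011979, whereas the population form gives roughly 0.0112, which suggests the report uses the sample standard deviation here.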
Detailed Results by K Value
K = 1:
- Mean align: 0.095886
- Mean z-score: 2.77
- Mean effect over random: 0.095642
K = 2:
- Mean align: 0.096110
- Mean z-score: 2.54
- Mean effect over random: 0.095622
K = 4:
- Mean align: 0.096928
- Mean z-score: 2.72
- Mean effect over random: 0.095951
K = 8:
- Mean align: 0.105326
- Mean z-score: 2.48
- Mean effect over random: 0.103373
K = 16:
- Mean align: 0.122886
- Mean z-score: 2.64
- Mean effect over random: 0.118980
K = 32:
- Mean align: 0.145117
- Mean z-score: 2.69
- Mean effect over random: 0.137305
K = 64:
- Mean align: 0.173137
- Mean z-score: 2.33
- Mean effect over random: 0.157512
K = 128:
- Mean align: 0.228232
- Mean z-score: 2.31
- Mean effect over random: 0.196982
K = 256:
- Mean align: 0.308274
- Mean z-score: 2.30
- Mean effect over random: 0.245774
K = 512:
- Mean align: 0.429478
- Mean z-score: 1.90
- Mean effect over random: 0.304478
K = 1024:
- Mean align: 0.594914
- Mean z-score: 1.78
- Mean effect over random: 0.344914
K = 2048:
- Mean align: 0.782698
- Mean z-score: 1.83
- Mean effect over random: 0.282698
K = 4096:
- Mean align: 1.000000
- Mean z-score: 0.00
- Mean effect over random: 0.000000
Complete Analysis Plots
Comprehensive visualization of all metrics for this run.
Complete Analysis Visualization
See "Plot Explanations" section at the top of this report for detailed information about this plot.
3. Individual Analysis - Run 2
Setup and Configuration
Summary Statistics (averaged across experts)
| k | align | delta_vs_shuffle | z_vs_shuffle | effect_over_random | cos_squared |
| --- | --- | --- | --- | --- | --- |
| 1 | 0.072366 | 0.063612 | 2.790334 | 0.072121 | 0.072366 |
| 2 | 0.072696 | 0.062340 | 2.555272 | 0.072208 | 0.000000 |
| 4 | 0.074890 | 0.064735 | 2.673774 | 0.073913 | 0.000000 |
| 8 | 0.076754 | 0.064113 | 2.463954 | 0.074800 | 0.000000 |
| 16 | 0.082027 | 0.068644 | 2.692019 | 0.078121 | 0.000000 |
| 32 | 0.100164 | 0.082604 | 2.779862 | 0.092352 | 0.000000 |
| 64 | 0.117190 | 0.091304 | 2.653731 | 0.101565 | 0.000000 |
| 128 | 0.158847 | 0.116712 | 2.598195 | 0.127597 | 0.000000 |
| 256 | 0.215059 | 0.145011 | 2.727091 | 0.152559 | 0.000000 |
| 512 | 0.307792 | 0.183919 | 2.608082 | 0.182792 | 0.000000 |
| 1024 | 0.464653 | 0.234680 | 2.436636 | 0.214653 | 0.000000 |
| 2048 | 0.682809 | 0.239296 | 2.770397 | 0.182809 | 0.000000 |
| 4096 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
Cos²(θ) Alignment (k=1)
Mean cos²(θ): 0.072366
Max cos²(θ): 0.082854
Min cos²(θ): 0.059842
Std cos²(θ): 0.009443
Per-expert cos²(θ) values:
| Expert | cos²(θ) | align |
| --- | --- | --- |
| 0 | 0.076257 | 0.076257 |
| 1 | 0.080060 | 0.080060 |
| 2 | 0.059842 | 0.059842 |
| 3 | 0.082841 | 0.082841 |
| 4 | 0.063383 | 0.063383 |
| 5 | 0.062867 | 0.062867 |
| 6 | 0.082854 | 0.082854 |
| 7 | 0.070820 | 0.070820 |
Detailed Results by K Value
K = 1:
- Mean align: 0.072366
- Mean z-score: 2.79
- Mean effect over random: 0.072121
K = 2:
- Mean align: 0.072696
- Mean z-score: 2.56
- Mean effect over random: 0.072208
K = 4:
- Mean align: 0.074890
- Mean z-score: 2.67
- Mean effect over random: 0.073913
K = 8:
- Mean align: 0.076754
- Mean z-score: 2.46
- Mean effect over random: 0.074800
K = 16:
- Mean align: 0.082027
- Mean z-score: 2.69
- Mean effect over random: 0.078121
K = 32:
- Mean align: 0.100164
- Mean z-score: 2.78
- Mean effect over random: 0.092352
K = 64:
- Mean align: 0.117190
- Mean z-score: 2.65
- Mean effect over random: 0.101565
K = 128:
- Mean align: 0.158847
- Mean z-score: 2.60
- Mean effect over random: 0.127597
K = 256:
- Mean align: 0.215059
- Mean z-score: 2.73
- Mean effect over random: 0.152559
K = 512:
- Mean align: 0.307792
- Mean z-score: 2.61
- Mean effect over random: 0.182792
K = 1024:
- Mean align: 0.464653
- Mean z-score: 2.44
- Mean effect over random: 0.214653
K = 2048:
- Mean align: 0.682809
- Mean z-score: 2.77
- Mean effect over random: 0.182809
K = 4096:
- Mean align: 1.000000
- Mean z-score: 0.00
- Mean effect over random: 0.000000
Complete Analysis Plots
Comprehensive visualization of all metrics for this run.
Complete Analysis Visualization
See "Plot Explanations" section at the top of this report for detailed information about this plot.
4. Individual Analysis - Run 3
Setup and Configuration
Summary Statistics (averaged across experts)
| k | align | delta_vs_shuffle | z_vs_shuffle | effect_over_random | cos_squared |
| --- | --- | --- | --- | --- | --- |
| 1 | 0.071958 | 0.060979 | 2.712114 | 0.071714 | 0.071958 |
| 2 | 0.083511 | 0.068067 | 2.440966 | 0.083023 | 0.000000 |
| 4 | 0.103484 | 0.083800 | 2.616887 | 0.102507 | 0.000000 |
| 8 | 0.118817 | 0.088900 | 2.314535 | 0.116864 | 0.000000 |
| 16 | 0.149789 | 0.108853 | 2.532124 | 0.145882 | 0.000000 |
| 32 | 0.183115 | 0.126321 | 2.636953 | 0.175302 | 0.000000 |
| 64 | 0.227703 | 0.143941 | 2.468127 | 0.212078 | 0.000000 |
| 128 | 0.287316 | 0.169768 | 2.421234 | 0.256066 | 0.000000 |
| 256 | 0.361914 | 0.201321 | 2.540398 | 0.299414 | 0.000000 |
| 512 | 0.448173 | 0.220829 | 2.405310 | 0.323173 | 0.000000 |
| 1024 | 0.556532 | 0.223693 | 2.230740 | 0.306532 | 0.000000 |
| 2048 | 0.720768 | 0.205407 | 2.433975 | 0.220768 | 0.000000 |
| 4096 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
Cos²(θ) Alignment (k=1)
Mean cos²(θ): 0.071958
Max cos²(θ): 0.106940
Min cos²(θ): 0.056461
Std cos²(θ): 0.016288
Per-expert cos²(θ) values:
| Expert | cos²(θ) | align |
| --- | --- | --- |
| 0 | 0.106940 | 0.106940 |
| 1 | 0.078474 | 0.078474 |
| 2 | 0.064948 | 0.064948 |
| 3 | 0.074671 | 0.074671 |
| 4 | 0.056461 | 0.056461 |
| 5 | 0.074265 | 0.074265 |
| 6 | 0.060473 | 0.060473 |
| 7 | 0.059433 | 0.059433 |
Detailed Results by K Value
K = 1:
- Mean align: 0.071958
- Mean z-score: 2.71
- Mean effect over random: 0.071714
K = 2:
- Mean align: 0.083511
- Mean z-score: 2.44
- Mean effect over random: 0.083023
K = 4:
- Mean align: 0.103484
- Mean z-score: 2.62
- Mean effect over random: 0.102507
K = 8:
- Mean align: 0.118817
- Mean z-score: 2.31
- Mean effect over random: 0.116864
K = 16:
- Mean align: 0.149789
- Mean z-score: 2.53
- Mean effect over random: 0.145882
K = 32:
- Mean align: 0.183115
- Mean z-score: 2.64
- Mean effect over random: 0.175302
K = 64:
- Mean align: 0.227703
- Mean z-score: 2.47
- Mean effect over random: 0.212078
K = 128:
- Mean align: 0.287316
- Mean z-score: 2.42
- Mean effect over random: 0.256066
K = 256:
- Mean align: 0.361914
- Mean z-score: 2.54
- Mean effect over random: 0.299414
K = 512:
- Mean align: 0.448173
- Mean z-score: 2.41
- Mean effect over random: 0.323173
K = 1024:
- Mean align: 0.556532
- Mean z-score: 2.23
- Mean effect over random: 0.306532
K = 2048:
- Mean align: 0.720768
- Mean z-score: 2.43
- Mean effect over random: 0.220768
K = 4096:
- Mean align: 1.000000
- Mean z-score: 0.00
- Mean effect over random: 0.000000
Complete Analysis Plots
Comprehensive visualization of all metrics for this run.
Complete Analysis Visualization
See "Plot Explanations" section at the top of this report for detailed information about this plot.
5. Individual Analysis - Run 4
Setup and Configuration
Summary Statistics (averaged across experts)
| k | align | delta_vs_shuffle | z_vs_shuffle | effect_over_random | cos_squared |
| --- | --- | --- | --- | --- | --- |
| 1 | 0.073495 | 0.061777 | 2.535697 | 0.073251 | 0.073495 |
| 2 | 0.085189 | 0.068171 | 2.394850 | 0.084701 | 0.000000 |
| 4 | 0.098532 | 0.075852 | 2.536124 | 0.097556 | 0.000000 |
| 8 | 0.115838 | 0.083475 | 2.319364 | 0.113885 | 0.000000 |
| 16 | 0.145283 | 0.100287 | 2.448533 | 0.141377 | 0.000000 |
| 32 | 0.180897 | 0.119697 | 2.637595 | 0.173085 | 0.000000 |
| 64 | 0.217590 | 0.130625 | 2.390451 | 0.201965 | 0.000000 |
| 128 | 0.263065 | 0.145904 | 2.430203 | 0.231815 | 0.000000 |
| 256 | 0.325066 | 0.167764 | 2.510164 | 0.262566 | 0.000000 |
| 512 | 0.399251 | 0.177589 | 2.334718 | 0.274251 | 0.000000 |
| 1024 | 0.505917 | 0.180667 | 2.177440 | 0.255917 | 0.000000 |
| 2048 | 0.672919 | 0.161160 | 2.253222 | 0.172919 | 0.000000 |
| 4096 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
Cos²(θ) Alignment (k=1)
Mean cos²(θ): 0.073495
Max cos²(θ): 0.111711
Min cos²(θ): 0.051174
Std cos²(θ): 0.019774
Per-expert cos²(θ) values:
| Expert | cos²(θ) | align |
| --- | --- | --- |
| 0 | 0.068715 | 0.068715 |
| 1 | 0.055104 | 0.055104 |
| 2 | 0.072262 | 0.072262 |
| 3 | 0.051174 | 0.051174 |
| 4 | 0.111711 | 0.111711 |
| 5 | 0.086037 | 0.086037 |
| 6 | 0.060293 | 0.060293 |
| 7 | 0.082666 | 0.082666 |
Detailed Results by K Value
K = 1:
- Mean align: 0.073495
- Mean z-score: 2.54
- Mean effect over random: 0.073251
K = 2:
- Mean align: 0.085189
- Mean z-score: 2.39
- Mean effect over random: 0.084701
K = 4:
- Mean align: 0.098532
- Mean z-score: 2.54
- Mean effect over random: 0.097556
K = 8:
- Mean align: 0.115838
- Mean z-score: 2.32
- Mean effect over random: 0.113885
K = 16:
- Mean align: 0.145283
- Mean z-score: 2.45
- Mean effect over random: 0.141377
K = 32:
- Mean align: 0.180897
- Mean z-score: 2.64
- Mean effect over random: 0.173085
K = 64:
- Mean align: 0.217590
- Mean z-score: 2.39
- Mean effect over random: 0.201965
K = 128:
- Mean align: 0.263065
- Mean z-score: 2.43
- Mean effect over random: 0.231815
K = 256:
- Mean align: 0.325066
- Mean z-score: 2.51
- Mean effect over random: 0.262566
K = 512:
- Mean align: 0.399251
- Mean z-score: 2.33
- Mean effect over random: 0.274251
K = 1024:
- Mean align: 0.505917
- Mean z-score: 2.18
- Mean effect over random: 0.255917
K = 2048:
- Mean align: 0.672919
- Mean z-score: 2.25
- Mean effect over random: 0.172919
K = 4096:
- Mean align: 1.000000
- Mean z-score: 0.00
- Mean effect over random: 0.000000
Complete Analysis Plots
Comprehensive visualization of all metrics for this run.
[Figure: Complete Analysis Visualization]
See "Plot Explanations" section at the top of this report for detailed information about this plot.
6. Individual Analysis - Run 5
Setup and Configuration
Summary Statistics (averaged across experts)
| k | align | delta_vs_shuffle | z_vs_shuffle | effect_over_random | cos_squared |
| 1 | 0.057551 | 0.047232 | 2.554473 | 0.057307 | 0.057551 |
| 2 | 0.059381 | 0.047484 | 2.482403 | 0.058893 | 0.000000 |
| 4 | 0.068402 | 0.054134 | 2.553551 | 0.067426 | 0.000000 |
| 8 | 0.075794 | 0.054138 | 2.426632 | 0.073841 | 0.000000 |
| 16 | 0.104669 | 0.068705 | 2.044184 | 0.100762 | 0.000000 |
| 32 | 0.139227 | 0.069573 | 1.228228 | 0.131414 | 0.000000 |
| 64 | 0.188318 | 0.063255 | 0.907765 | 0.172693 | 0.000000 |
| 128 | 0.264706 | 0.052687 | 0.590516 | 0.233456 | 0.000000 |
| 256 | 0.416913 | 0.053419 | 0.453827 | 0.354413 | 0.000000 |
| 512 | 0.611965 | 0.060646 | 0.445669 | 0.486965 | 0.000000 |
| 1024 | 0.767452 | 0.057432 | 0.421641 | 0.517452 | 0.000000 |
| 2048 | 0.885113 | 0.037467 | 0.398855 | 0.385113 | 0.000000 |
| 4096 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
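The effect_over_random column in this table appears to equal align minus k/4096, where k/d is the expected alignment of a uniformly random unit vector with any k-dimensional subspace of a d-dimensional space. This relationship is inferred from the tabulated numbers, not stated in the report; the dimension 4096 is likewise an assumption, taken from the largest k, at which align reaches 1. A small sketch that checks the relationship on the rows above:

```python
# (k, align, effect_over_random) rows copied from the summary table above.
rows = [
    (1, 0.057551, 0.057307),
    (2, 0.059381, 0.058893),
    (4, 0.068402, 0.067426),
    (8, 0.075794, 0.073841),
    (16, 0.104669, 0.100762),
    (32, 0.139227, 0.131414),
    (64, 0.188318, 0.172693),
    (128, 0.264706, 0.233456),
    (256, 0.416913, 0.354413),
    (512, 0.611965, 0.486965),
    (1024, 0.767452, 0.517452),
    (2048, 0.885113, 0.385113),
    (4096, 1.000000, 0.000000),
]

DIM = 4096  # assumed vector dimension, inferred from the largest k


def random_baseline(k, dim=DIM):
    """Expected align(k) for a uniformly random unit vector: k / dim."""
    return k / dim


# Largest absolute mismatch between (align - k/dim) and the tabulated effect.
max_gap = max(
    abs((align - random_baseline(k)) - effect) for k, align, effect in rows
)
print(f"largest mismatch: {max_gap:.2e}")
```

The mismatch stays within the rounding of the six-decimal values, which is consistent with the assumed definition. It also explains why the column falls back to zero at k = 4096: the random baseline itself reaches 1 there.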
Cos²(θ) Alignment (k=1)
Mean cos²(θ): 0.057551
Max cos²(θ): 0.070422
Min cos²(θ): 0.042068
Std cos²(θ): 0.011621
Per-expert cos²(θ) values:
| Expert | cos²(θ) | align |
| 0 | 0.054994 | 0.054994 |
| 1 | 0.070422 | 0.070422 |
| 2 | 0.059095 | 0.059095 |
| 3 | 0.042068 | 0.042068 |
| 4 | 0.068530 | 0.068530 |
| 5 | 0.053046 | 0.053046 |
| 6 | 0.070022 | 0.070022 |
| 7 | 0.042234 | 0.042234 |
Detailed Results by K Value
K = 1:
- Mean align: 0.057551
- Mean z-score: 2.55
- Mean effect over random: 0.057307
K = 2:
- Mean align: 0.059381
- Mean z-score: 2.48
- Mean effect over random: 0.058893
K = 4:
- Mean align: 0.068402
- Mean z-score: 2.55
- Mean effect over random: 0.067426
K = 8:
- Mean align: 0.075794
- Mean z-score: 2.43
- Mean effect over random: 0.073841
K = 16:
- Mean align: 0.104669
- Mean z-score: 2.04
- Mean effect over random: 0.100762
K = 32:
- Mean align: 0.139227
- Mean z-score: 1.23
- Mean effect over random: 0.131414
K = 64:
- Mean align: 0.188318
- Mean z-score: 0.91
- Mean effect over random: 0.172693
K = 128:
- Mean align: 0.264706
- Mean z-score: 0.59
- Mean effect over random: 0.233456
K = 256:
- Mean align: 0.416913
- Mean z-score: 0.45
- Mean effect over random: 0.354413
K = 512:
- Mean align: 0.611965
- Mean z-score: 0.45
- Mean effect over random: 0.486965
K = 1024:
- Mean align: 0.767452
- Mean z-score: 0.42
- Mean effect over random: 0.517452
K = 2048:
- Mean align: 0.885113
- Mean z-score: 0.40
- Mean effect over random: 0.385113
K = 4096:
- Mean align: 1.000000
- Mean z-score: 0.00
- Mean effect over random: 0.000000
Complete Analysis Plots
Comprehensive visualization of all metrics for this run.
[Figure: Complete Analysis Visualization]
See "Plot Explanations" section at the top of this report for detailed information about this plot.
7. Individual Analysis - Run 6
Setup and Configuration
Summary Statistics (averaged across experts)
| k | align | delta_vs_shuffle | z_vs_shuffle | effect_over_random | cos_squared |
| 1 | 0.096284 | 0.083204 | 2.714199 | 0.096039 | 0.096284 |
| 2 | 0.104348 | 0.087288 | 2.481372 | 0.103860 | 0.000000 |
| 4 | 0.118186 | 0.097865 | 2.627248 | 0.117210 | 0.000000 |
| 8 | 0.131431 | 0.098892 | 2.322079 | 0.129478 | 0.000000 |
| 16 | 0.141373 | 0.098440 | 2.420509 | 0.137467 | 0.000000 |
| 32 | 0.157609 | 0.093518 | 2.438773 | 0.149797 | 0.000000 |
| 64 | 0.186806 | 0.090480 | 2.143955 | 0.171181 | 0.000000 |
| 128 | 0.234749 | 0.082012 | 1.664156 | 0.203499 | 0.000000 |
| 256 | 0.328797 | 0.084129 | 1.319956 | 0.266297 | 0.000000 |
| 512 | 0.469418 | 0.083825 | 1.143750 | 0.344418 | 0.000000 |
| 1024 | 0.639959 | 0.070529 | 1.139612 | 0.389959 | 0.000000 |
| 2048 | 0.817105 | 0.041170 | 1.323049 | 0.317105 | 0.000000 |
| 4096 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
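If delta_vs_shuffle is the numerator of the z-score formula, that is align(k) - shuffle_mean(k), then the mean shuffle baseline can be recovered from the table by subtraction. That definition is an assumption based on the column name and the z-score formula in "Plot Explanations"; the report does not define the column explicitly. A sketch under that assumption, with a hypothetical helper name:

```python
# (k, align, delta_vs_shuffle) rows copied from the summary table above.
rows = [
    (1, 0.096284, 0.083204),
    (2, 0.104348, 0.087288),
    (4, 0.118186, 0.097865),
    (8, 0.131431, 0.098892),
    (16, 0.141373, 0.098440),
    (32, 0.157609, 0.093518),
    (64, 0.186806, 0.090480),
    (128, 0.234749, 0.082012),
    (256, 0.328797, 0.084129),
    (512, 0.469418, 0.083825),
    (1024, 0.639959, 0.070529),
    (2048, 0.817105, 0.041170),
    (4096, 1.000000, 0.000000),
]


def implied_shuffle_mean(align, delta):
    # Assumption: delta_vs_shuffle = align - shuffle_mean, so the
    # baseline is recovered by subtracting delta from align.
    return align - delta


baseline = {k: implied_shuffle_mean(a, d) for k, a, d in rows}
for k, value in baseline.items():
    print(f"k={k:<5d} implied shuffle mean = {value:.6f}")
```

Subtraction commutes with averaging over experts, so this recovery is exact up to rounding. The same is not true of shuffle_std: dividing the averaged delta by the averaged z-score only approximates it, because the mean of per-expert ratios is not the ratio of the means.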
Cos²(θ) Alignment (k=1)
Mean cos²(θ): 0.096284
Max cos²(θ): 0.112351
Min cos²(θ): 0.079784
Std cos²(θ): 0.012235
Per-expert cos²(θ) values:
| Expert | cos²(θ) | align |
| 0 | 0.088580 | 0.088580 |
| 1 | 0.108539 | 0.108539 |
| 2 | 0.084463 | 0.084463 |
| 3 | 0.079784 | 0.079784 |
| 4 | 0.101385 | 0.101385 |
| 5 | 0.089250 | 0.089250 |
| 6 | 0.112351 | 0.112351 |
| 7 | 0.105917 | 0.105917 |
Detailed Results by K Value
K = 1:
- Mean align: 0.096284
- Mean z-score: 2.71
- Mean effect over random: 0.096039
K = 2:
- Mean align: 0.104348
- Mean z-score: 2.48
- Mean effect over random: 0.103860
K = 4:
- Mean align: 0.118186
- Mean z-score: 2.63
- Mean effect over random: 0.117210
K = 8:
- Mean align: 0.131431
- Mean z-score: 2.32
- Mean effect over random: 0.129478
K = 16:
- Mean align: 0.141373
- Mean z-score: 2.42
- Mean effect over random: 0.137467
K = 32:
- Mean align: 0.157609
- Mean z-score: 2.44
- Mean effect over random: 0.149797
K = 64:
- Mean align: 0.186806
- Mean z-score: 2.14
- Mean effect over random: 0.171181
K = 128:
- Mean align: 0.234749
- Mean z-score: 1.66
- Mean effect over random: 0.203499
K = 256:
- Mean align: 0.328797
- Mean z-score: 1.32
- Mean effect over random: 0.266297
K = 512:
- Mean align: 0.469418
- Mean z-score: 1.14
- Mean effect over random: 0.344418
K = 1024:
- Mean align: 0.639959
- Mean z-score: 1.14
- Mean effect over random: 0.389959
K = 2048:
- Mean align: 0.817105
- Mean z-score: 1.32
- Mean effect over random: 0.317105
K = 4096:
- Mean align: 1.000000
- Mean z-score: 0.00
- Mean effect over random: 0.000000
Complete Analysis Plots
Comprehensive visualization of all metrics for this run.
[Figure: Complete Analysis Visualization]
See "Plot Explanations" section at the top of this report for detailed information about this plot.